A Robust Functional EM Algorithm for Incomplete Panel Count Data

Neural Information Processing Systems

Panel count data describe aggregated counts of recurrent events observed at discrete time points. To understand the dynamics of health behaviors and predict future negative events, quantitative behavioral research increasingly relies on panel count data collected via multiple self-reports, for example, reports of smoking frequency from in-the-moment surveys on mobile devices. However, missing reports are common and present a major barrier to downstream statistical learning. As a first step, under a missing completely at random (MCAR) assumption, we propose a simple yet widely applicable functional EM algorithm to estimate the counting process mean function, which is of central interest to behavioral scientists. The proposed approach wraps several popular panel count inference methods, seamlessly handles incomplete counts, and is robust to misspecification of the Poisson process assumption. Theoretical analysis of the proposed algorithm provides finite-sample guarantees by extending parametric EM theory to the general non-parametric setting. We illustrate the utility of the proposed algorithm through numerical experiments and an analysis of smoking cessation data. We also discuss useful extensions to address deviations from the MCAR assumption and covariate effects.
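To make the idea of an EM update for a mean function concrete, here is a minimal toy sketch in Python. It is not the authors' functional EM algorithm. It assumes a much simpler setting that the abstract does not specify: every subject shares the same grid of intervals, counts in different intervals are independent Poisson variables, and a missing count is marked with `None`. The function name `em_mean_function` and the example `panel` are hypothetical.

```python
from itertools import accumulate


def em_mean_function(panel, n_iter=100, init=1.0):
    """Toy EM for per-interval Poisson rates with missing counts (None).

    panel: list of rows, one per subject; each row holds one count per interval.
    Returns the estimated mean function on the grid (cumulative sum of rates).
    Assumes every interval has at least one observed count.
    """
    n_subjects = len(panel)
    n_intervals = len(panel[0])
    rates = [init] * n_intervals
    for _ in range(n_iter):
        new_rates = []
        for j in range(n_intervals):
            column = [row[j] for row in panel]
            observed_sum = sum(c for c in column if c is not None)
            n_missing = sum(1 for c in column if c is None)
            # E-step: a missing count is replaced by its expected value,
            # which under this toy model is the current rate estimate.
            expected_total = observed_sum + n_missing * rates[j]
            # M-step: the complete-data Poisson MLE is the average count.
            new_rates.append(expected_total / n_subjects)
        rates = new_rates
    # The mean function at each grid point is the sum of rates up to that point.
    return list(accumulate(rates))


# Hypothetical data: 4 subjects, 3 intervals, one missing count per interval.
panel = [
    [2, None, 1],
    [3, 4, None],
    [None, 2, 3],
    [1, 3, 2],
]
mean_function = em_mean_function(panel)
print(mean_function)
```

In this toy model the iteration converges to the average of the observed counts in each interval, so EM adds nothing over a direct average. The sketch only shows the shape of the E-step and M-step; the setting described in the abstract is more general.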


Review for NeurIPS paper: A Robust Functional EM Algorithm for Incomplete Panel Count Data

Neural Information Processing Systems

The reviewers all agree that this paper represents a contribution to theory and methods for missing data but note a few limitations. The MCAR assumption is a strong one, but the authors address this in their rebuttal and hopefully more directly in the revised paper. Similarly, it appears they've added baseline performance metrics and other comparisons as requested.


Review for NeurIPS paper: A Robust Functional EM Algorithm for Incomplete Panel Count Data

Neural Information Processing Systems

Weaknesses: - The MCAR assumption is difficult to justify in practice. This is good; however, could the authors clarify the following points regarding their method in the context of MCAR missingness? By definition, MCAR implies that one can simply ignore any rows of data containing missingness, and restricting the analysis to so-called "complete cases" will still yield unbiased estimates of the parameter of interest. In light of this, and of the bounds on \epsilon implying that there will always be complete cases in the data as n \to \infty (if this were not true, the parameters of interest would not be identifiable), what is the advantage of the proposed EM algorithm over simply doing a complete-case analysis with some of the older tools cited in the paper that can be run on complete data? I apologize if I missed this, but there does not seem to be a baseline comparison to such a complete-case analysis, or to the alternative of directly maximizing the observed-data likelihood by integrating according to patterns of missingness.
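The reviewer's point about complete cases can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not an experiment from the paper: it draws independent Poisson counts on a shared grid of three intervals, deletes each count with a fixed probability regardless of its value (MCAR), and compares per-interval averages from complete cases only against averages that use every observed count. All names, rates, and the missingness probability are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate(n_subjects=20000, rates=(1.0, 3.0, 2.0), p_missing=0.3, seed=0):
    """Simulate a panel of counts and delete entries completely at random."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_subjects):
        row = [sample_poisson(r, rng) for r in rates]
        # MCAR: the deletion probability does not depend on the count itself.
        row = [None if rng.random() < p_missing else c for c in row]
        panel.append(row)
    return panel


def interval_means(rows):
    """Average the observed (non-None) counts in each interval."""
    means = []
    for j in range(len(rows[0])):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return means


true_rates = (1.0, 3.0, 2.0)
panel = simulate(rates=true_rates)
complete_cases = [row for row in panel if None not in row]

cc_means = interval_means(complete_cases)  # complete-case analysis
ac_means = interval_means(panel)           # uses every observed count

print("subjects:", len(panel), "complete cases:", len(complete_cases))
print("complete-case means:", cc_means)
print("all-observed means: ", ac_means)
```

Under these assumptions both sets of averages should land close to the true rates, which matches the reviewer's claim that complete-case analysis stays unbiased under MCAR. The two approaches differ in how many subjects contribute, since the complete-case estimate discards every subject with at least one missing count.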

